Author: Nicolas Legrand nicolas.legrand@cfin.au.dk

%%capture
import sys
if 'google.colab' in sys.modules:
    ! pip install systole
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from systole.detection import interpolate_clipping, ecg_peaks
from systole.plots import plot_raw
from systole import import_dataset1, import_ppg
from systole import serialSim
from systole.recording import Oximeter

from bokeh.io import output_notebook
from bokeh.plotting import show
output_notebook()

sns.set_context('talk')
Loading BokehJS ...

Detecting cardiac cycles#

This notebook focuses on the characterisation of cardiac cycles from the physiological signals we previously described (PPG and ECG). Here we only briefly mention different QRS-detection algorithms as related to the estimation of heart rate frequency, which is a commonly used measure in psychology and cognitive science. However, both the ECG and the PPG signals contain rich information beyond the beat to beat frequency that is relevant for cognitive and physiological modelling. All these approaches are not fully covered by Systole, but are implemented in other software (e.g see [1] for ECG components delineation).

In this notebook, we are going to review the peak detection algorithm, which future tutorials covering, for example, heart-rate variability will build upon. Here our intention is to provide a better intuition of what is happening in our peak detection algorithms, the corrections that can be applied, and the possible bias and artefacts that can emerge. However, you do not need a perfect understanding of all these steps if you want to apply peak detection to youre data, and you might also consider skipping this part to proceed directly to the next tutorial focusing on artefact detection and correction.

Electrocardiography#

Because we will ultimately be interested in heart rate and its variability, our first goal will be to detect the R peaks. We will use the ecg_peaks() function provided by Systole to perform R peak detection. This function is a simple wrapper for well-known peak detection algorithms that are implemented in the py-ecg-detectors module [2]. The detection algorithm can be selected via the ecg_method parameter, it should be among the following: hamilton, christov, engelse-zeelenberg, pan-tompkins, wavelet-transform, moving-average. In this tutorial, we will use the pan-tompkins algorithm [3] as it is a fast, well-perfoming, and commonly used algorithm for QRS detection.

Let’s first load an ECG recording. Here, we will select a 5 minute interval and compare the performances of the different algorithms supported by Systole.

# Import ECg recording
ecg_df = import_dataset1(modalities=['ECG'], disable=True)
signal = ecg_df[ecg_df.time.between(60, 360)].ecg.to_numpy()  # Select 5 minutes

Detecting R peaks#

The main feature that we can extract from the ECG recording is the R wave (see image in notebook 1). A large variety of algorithms have been proposed to extract the timing of R waves while controlling for signal noise and physiological variability. Reviewing and comparing all of the available methods is beyond the scope of this tutorial. Here, we are going to restrict our focus to some of the most popular methods, which also are available in the Python opensource ecosystem (hamilton, christov, engelse-zeelenberg, pan-tompkins, and moving-average. These methods were implemented originally in the py-ecg-detectors module - Systole includes a modified version that runs with the Numba package for better performance, resulting in 7-30x faster estimation, depending on the algorithm.

from systole.detectors import pan_tompkins, hamilton, moving_average, christov, engelse_zeelenberg

Pan-Tompkins#

The Pan-Tompkins algorithm was introduced in 1985. It uses band-pass filtering and derivation to identify the QRS complex and apply an adaptive threshold to detect the R peaks on the filtered signal.

This is a very popular - maybe to most popular - method for R peaks detection. One of the advantages is that it can easily be applied for online detection (see this implementation for example). As we can see in the timing report, this algorithm is also the fastest among those available in the Systole toolbox, making ideal for applications such as online peak detection and biofeedback. You can also see that the algorithm’s peak detection is highly accurate as we do not see any artifacts in the estimated R-R time series. One notable exception is the first peak which has been estimated incorrectly. This is due to the filtering steps in the Pan-Tompkins algorithms, and can be corrected by using a slightly longer signal than the range of interest.

%%timeit
peaks_pt = pan_tompkins(signal, sfreq=1000)
8.12 ms ± 699 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
show(
    plot_raw(signal, modality='ecg', ecg_method='pan-tompkins', backend='bokeh', show_heart_rate=True)
)

Moving average#

%%timeit
peaks_wa = ecg_peaks(signal, sfreq=1000, method="moving-average")
7.69 ms ± 343 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
show(
    plot_raw(signal, modality='ecg', ecg_method='moving-average', backend='bokeh', show_heart_rate=True)
)

Hamilton#

%%timeit
peaks_ha = hamilton(signal, sfreq=1000)
11.1 ms ± 885 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
show(
    plot_raw(signal, modality='ecg', ecg_method='hamilton', backend='bokeh', show_heart_rate=True)
)

Christov#

%%timeit
peaks_ch = christov(signal, sfreq=1000)
144 ms ± 818 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
show(
    plot_raw(signal, modality='ecg', ecg_method='christov', backend='bokeh', show_heart_rate=True)
)

Engelse-Zeelenberg#

%%timeit
peaks_ew = engelse_zeelenberg(signal, sfreq=1000)
57 ms ± 201 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
show(
    plot_raw(signal, modality='ecg', ecg_method='engelse-zeelenberg', backend='bokeh', show_heart_rate=True)
)

Photoplethysmography#

Systolic peaks detection#

Online#

Let’s first simulate some PPG recording. Here, we will use the serialSim class from Systole in conjunction with the Oximeter recording class. This will simulate the presence of a pulse oximeter on the computer and provide the information from a pre-recorded signal in real time. Simulating synthetic cardiac data in this way can be a great way to test your data analysis pipelines before collecting data - avoiding painful mistakes that can occur!

# Pulse oximeter simulation
ser = serialSim()

# Create an Oxymeter instance, initialize recording and record for 10 seconds
oxi = Oximeter(serial=ser, sfreq=75).setup()
oxi.read(20);
Reset input buffer
# Retrieve the data from the oxi class
times = np.array(oxi.times)
threshold = np.array(oxi.threshold)
recording = np.array(oxi.recording)
peaks = np.array(oxi.peaks)

This method uses the derivative to find peaks in the signal and select them based on and adaptive threshold based on the rolling mean and rolling standard deviation in a given time window.

fig, ax = plt.subplots(figsize=(15, 5))
ax.set_title("Oximeter recording")

ax.plot(times, threshold, linestyle="--", color="gray", label="Threshold", linewidth=1.5)
ax.fill_between(
    x=times, y1=threshold, y2=recording.min(), alpha=0.2, color="gray"
)
ax.plot(times, recording, label="Recording", color="#4c72b0", linewidth=1.5)
ax.fill_between(x=times, y1=recording, y2=recording.min(), color="w")
ax.scatter(x=times[np.where(peaks)[0]], y=recording[np.where(peaks)[0]],
           color="#c44e52", alpha=.6, label="Online estimation",
           edgecolors='k')
ax.set_ylabel("PPG level")
ax.set_xlabel("Time (s)")
ax.legend()
sns.despine()
../_images/f6d9d986fb40c369c1fa4b902e8e8b8007de3b626a3c22e8d5b543174de0aee9.png

Offline#

A simple online approach like the one we described is usually good enough to detect all the systolic peaks, provided that the subject is not moving too much.

ppg = import_ppg()

Clipping artefacts

Clipping is a form of distortion that can limit signal when it exceeds a certain threshold see the Wikipedia page. Some PPG devices can produce clipping artefacts when recording the PPG signal, for example if a participant is pressing too hard on the fingerclip. Here, we can see that some clipping artefacts are found between 100 and 150 seconds in the previous recording. The threshold values (here 255), is often set by the device and can easily be found depending on the manufacturer. These artefacts should be corrected before systolic peaks detection. One way to correct these artefacts is to remove the portion of the signal where clipping artefacts occurs and use cubic spline interpolation to reconstruct a plausible estimate of the real underlying signal. This is what the function interpolate_clipping() does [4].

signal = ppg[ppg.time.between(110, 113)].ppg.to_numpy()  # Extract a portion of signal with clipping artefacts
clean_signal = interpolate_clipping(signal, min_threshold=0, max_threshold=255)  # Remove clipping segment and interpolate missing calues
plt.figure(figsize=(15, 5))
plt.plot(np.arange(0, len(clean_signal))/75, clean_signal, label='Corrected PPG signal', linestyle= '--', color='#c44e52')
plt.plot(np.arange(0, len(signal))/75, signal, label='Raw PPG signal', color='#4c72b0')
plt.legend()
plt.xlabel('Time (s)')
plt.ylabel('PPG level (a.u)')
sns.despine()
plt.tight_layout()
../_images/553a80a2dd604bb29aa7c63f2e59d2654dad39ad0b4a522ede4d197fd73015f2.png

This concludes the tutorial on using Systole to detect individual heartbeats and to reject common ECG and PPG artefacts. The next tutorial will go further into more nuanced issues in artefact detection and correction.

References

[1] Makowski, D., Pham, T., Lau, Z. J., Brammer, J. C., Lespinasse, F., Pham, H., Schölzel, C., & Chen, S. A. (2021). NeuroKit2: A Python toolbox for neurophysiological signal processing. Behavior Research Methods. https://doi.org/10.3758/s13428-020-01516-y P, Q, S and T waves detection in the ECG signal. See also this tutorial: https://neurokit2.readthedocs.io/en/latest/examples/ecg_delineate.html

[2] Porr, B., & Howell, L. (2019). R-peak detector stress test with a new noisy ECG database reveals significant performance differences amongst popular detectors. Cold Spring Harbor Laboratory. https://doi.org/10.1101/722397 Python implementation of famous R peak detection algorithms tested against large dataset.

[3] Pan, Jiapu; Tompkins, Willis J. (March 1985). “A Real-Time QRS Detection Algorithm”. IEEE Transactions on Biomedical Engineering. BME-32 (3): 230–236. doi:10.1109/TBME.1985.325532 This paper describes the first implementation of the Pan-Tompkins algorithm

[4] van Gent, P., Farah, H., van Nes, N., & van Arem, B. (2019). HeartPy: A novel heart rate algorithm for the analysis of noisy signals. Transportation Research Part F: Traffic Psychology and Behaviour, 66, 368–378. https://doi.org/10.1016/j.trf.2019.09.015 This paper describes of a simple systolic peak detection algorithm and clipping artefacts correction.